Deep neural networks (DNNs) have become fundamental tools in many disciplines.
Meanwhile, they are known for their large number of parameters, high redundancy in weights, and extensive computing resource consumption, which poses a tremendous challenge to their deployment in real-time applications or on resource-constrained edge devices. To cope with this issue, compressing DNNs to accelerate their inference has drawn extensive interest. In the first part of this talk, we will showcase how vision transformers can be pruned with little performance degradation to meet onboard resource constraints, cast as a large-scale, constrained, multi-objective optimization problem. EvolutionViT is shown to trade off effectively between computational cost and performance under resource constraints, automatically searching for neural architectures while optimizing the two conflicting objectives.
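The bi-objective view behind such search-based pruning can be sketched in a few lines: each candidate pruned network is scored on two conflicting objectives (compute cost and error), candidates violating the resource budget are discarded, and only the Pareto-optimal trade-off set is kept. This is a minimal illustration of the general idea; the function names, the GFLOPs/error numbers, and the budget are hypothetical and not taken from EvolutionViT itself.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates, budget):
    """Keep feasible candidates (cost <= budget) not dominated by any other feasible one."""
    feasible = [c for c in candidates if c[0] <= budget]
    return [c for c in feasible
            if not any(dominates(o, c) for o in feasible if o is not c)]

# (cost in GFLOPs, top-1 error) pairs for hypothetical pruned ViT variants
cands = [(4.6, 0.18), (2.1, 0.21), (1.2, 0.30), (2.5, 0.20), (1.2, 0.26)]
print(pareto_front(cands, budget=3.0))
# -> [(2.1, 0.21), (2.5, 0.20), (1.2, 0.26)]
```

A full evolutionary search would iterate this selection step over generations with crossover and mutation of pruning configurations, but the dominance test above is the core of how two conflicting objectives are compared without collapsing them into a single weighted score.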
Neural architecture search (NAS) aims to automate the architecture design process, which typically requires a great deal of domain knowledge and human ingenuity. Architectures discovered by NAS algorithms have outperformed manually crafted networks on a variety of tasks. Nonetheless, the success of NAS often comes with a significant investment of computing resources to evaluate numerous candidate architectures during the search, a prohibitive expense that deters many interested practitioners without vast computing resources. Consequently, effective means of speeding up the search process have been intensively studied in recent years. As research in NAS has advanced, a wide variety of acceleration methods have emerged, progressively reducing its computational cost: with low-fidelity approximations, performance predictors, and one-shot NAS, the search cost has dropped from thousands of GPU days to a few GPU days or even hours. However, these methods still rely heavily on validation accuracy to guide the search, and they all essentially involve network training, whether of candidate architectures or of supernets, which inevitably imposes a sizable computational overhead. Fortunately, zero-cost proxies have recently emerged that predict architecture performance at initialization, based on theoretical understandings of deep neural networks. Since only a single forward inference and backward propagation is required, they slash the search cost of NAS further, to the order of minutes or seconds on a single GPU. In the second part of this talk, we will discuss a fine-grained perturbation-aware term that measures how well an architecture can distinguish between inputs and their perturbed counterparts.
We propose a layer-wise score multiplication approach to combine these two scoring terms, deriving a new proxy, named efficient perturbation-aware distinguishing score (ePADS). Experiments on various NAS spaces and datasets show that ePADS consistently outperforms other zero-cost proxies in terms of both predictive reliability and efficiency.
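The two ingredients named above, a perturbation-aware term computed at initialization and layer-wise multiplication of scores, can be illustrated with a toy zero-cost proxy: each randomly initialized layer is scored by how far apart its activations for a clean input and a slightly perturbed copy are, and the per-layer scores are multiplied (accumulated as a sum of logarithms for numerical stability). The network shape, the score definition, and all constants here are hypothetical stand-ins, not the actual ePADS formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer_score(W, x, eps=1e-2):
    """Mean gap between a layer's activations on clean vs. perturbed input."""
    noisy = x + eps * rng.standard_normal(x.shape)
    a = np.maximum(W @ x, 0.0)        # ReLU activations, clean input
    a_p = np.maximum(W @ noisy, 0.0)  # ReLU activations, perturbed input
    return np.abs(a - a_p).mean() + 1e-12  # keep strictly positive for log

def proxy_score(weights, x):
    """Layer-wise score multiplication, accumulated as a sum of logs."""
    total = 0.0
    for W in weights:
        total += np.log(layer_score(W, x))
        x = np.maximum(W @ x, 0.0)  # single forward pass to the next layer
    return total  # higher => inputs and perturbations are better separated

# score a random 3-layer "architecture" at initialization, no training involved
weights = [rng.standard_normal((16, 16)) / 4.0 for _ in range(3)]
x = rng.standard_normal(16)
print(proxy_score(weights, x))
```

In a zero-cost NAS loop, a score like this would be computed once per candidate architecture and used to rank candidates, which is what keeps the total search cost at a single forward/backward pass per architecture.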
Gary G. Yen
received his Ph.D. degree in electrical and computer engineering from the University of Notre Dame in 1992. He was a Regents Professor in the School of Electrical and Computer Engineering, Oklahoma State University. He recently joined the College of Computer Science, Sichuan University, as a Chair Professor. His research interests include intelligent control, computational intelligence, evolutionary multiobjective optimization, conditional health monitoring, signal processing, and their industrial/defense applications.
Gary was an associate editor of the IEEE Transactions on Neural Networks, IEEE Transactions on Evolutionary Computation, IEEE Transactions on Emerging Topics in Computational Intelligence, and IEEE Control Systems Magazine during 1994-1999, and of the IEEE Transactions on Control Systems Technology, IEEE Transactions on Systems, Man, and Cybernetics (Parts A and B), and the IFAC journals Automatica and Mechatronics during 2000-2010. He is currently serving as an associate editor for the IEEE Transactions on Cybernetics and IEEE Transactions on Artificial Intelligence. Gary served as Vice President for Technical Activities of the IEEE Computational Intelligence Society in 2004-2005 and was the founding editor-in-chief of the IEEE Computational Intelligence Magazine, 2006-2009. He was elected President of the IEEE Computational Intelligence Society for 2010-2011 and was elected a Distinguished Lecturer for the terms 2012-2014, 2016-2018, 2021-2023, and 2025-2027. He received the Regents Distinguished Research Award from OSU in 2009, the 2011 Andrew P. Sage Best Transactions Paper Award from the IEEE Systems, Man, and Cybernetics Society, the 2013 Meritorious Service Award from the IEEE Computational Intelligence Society, and the 2014 Lockheed Martin Aeronautics Excellence Teaching Award. He is a Fellow of IEEE, IET, and IAPR.